maxAlike: Sequence Reconstruction by Maximum Likelihood Estimation
Authors
Abstract
Reconstructing a sequence in a particular species is gaining importance in light of the rapid development of high-throughput sequencing technologies and their limitations. Applications include not only compensation for missing data in unsequenced genomic regions but also the preparation of customized queries for homology-based searches. Here, we introduce the maxAlike web server. As input, it takes a multiple sequence alignment and a phylogenetic tree that also contains the target species. For the target species, it computes nucleotide probabilities as well as the most likely sequence, which can be used for primer design or homology search. Furthermore, position-specific scoring matrices (PSSMs) of high-confidence regions are available for download. We show that as much as 99% of a sequence can be reconstructed correctly using the maxAlike algorithm when the sequence of a closely related species is available, compared to only 89% of positions when using the consensus sequence from the input alignment alone. For more distant species, the reconstruction rate of maxAlike drops to a plateau of about 60–70%, compared to 50–60% for the consensus sequence. The web server is freely accessible at: http://rth.ku.dk/resources/maxAlike.
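The idea of computing per-position nucleotide probabilities for a target leaf of a tree, and comparing the most likely sequence with a plain consensus, can be illustrated with a small sketch. This is not the maxAlike implementation: it assumes a Jukes-Cantor (JC69) substitution model with uniform base frequencies, and the tree, branch lengths, species names, and function names are all hypothetical. Per-column likelihoods are computed with Felsenstein's pruning algorithm.

```python
import math

NUCS = "ACGT"

def jc69(t):
    """JC69 probabilities (same base, one specific other base) for branch length t."""
    e = math.exp(-4.0 * t / 3.0)
    return 0.25 + 0.75 * e, 0.25 - 0.25 * e

def partial(node, column):
    """Felsenstein pruning for one alignment column.

    node is a leaf name (str) or a list of (child, branch_length) pairs.
    Returns P(observed leaves below node | node state = a) for each a in NUCS.
    """
    if isinstance(node, str):
        obs = column.get(node)
        if obs not in NUCS:  # missing row, gap, or ambiguous base: uninformative
            return [1.0] * 4
        return [1.0 if a == obs else 0.0 for a in NUCS]
    like = [1.0] * 4
    for child, t in node:
        same, diff = jc69(t)
        below = partial(child, column)
        total = sum(below)
        for i in range(4):
            like[i] *= same * below[i] + diff * (total - below[i])
    return like

def target_posterior(tree, column, target):
    """Posterior over the target's base: fix each candidate base, compare likelihoods."""
    scores = []
    for a in NUCS:
        filled = {**column, target: a}
        scores.append(0.25 * sum(partial(tree, filled)))  # uniform root prior
    z = sum(scores)
    return {a: s / z for a, s in zip(NUCS, scores)}

def reconstruct(tree, alignment, target):
    """Most likely target sequence, chosen column by column."""
    length = len(next(iter(alignment.values())))
    seq = []
    for j in range(length):
        column = {name: s[j] for name, s in alignment.items()}
        post = target_posterior(tree, column, target)
        seq.append(max(NUCS, key=post.get))
    return "".join(seq)

def consensus(alignment):
    """Majority base per column, ignoring the tree entirely."""
    length = len(next(iter(alignment.values())))
    out = []
    for j in range(length):
        counts = {a: 0 for a in NUCS}
        for s in alignment.values():
            if s[j] in counts:
                counts[s[j]] += 1
        out.append(max(NUCS, key=counts.get))
    return "".join(out)

# Hypothetical toy input: the target has one close relative and two distant ones.
tree = [
    ([("target", 0.01), ("sister", 0.01)], 0.5),
    ([("out1", 0.1), ("out2", 0.1)], 0.5),
]
alignment = {"sister": "ACGT", "out1": "AAGG", "out2": "AAGG"}

print("likelihood-based:", reconstruct(tree, alignment, "target"))
print("consensus:       ", consensus(alignment))
```

In this toy input the two distant species outvote the close relative in the consensus, while the tree-aware estimate weights the close relative more heavily because of its short branch to the target. That mirrors, in miniature, why a phylogeny-aware reconstruction can outperform the consensus when a closely related sequence is available.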
Similar resources
maxAlike: maximum likelihood-based sequence reconstruction with application to improved primer design for unknown sequences
MOTIVATION The task of reconstructing a genomic sequence from a particular species is gaining more and more importance in the light of the rapid development of high-throughput sequencing technologies and their limitations. Applications include not only compensation for missing data in unsequenced genomic regions and the design of oligonucleotide primers for target genes in species with lacking ...
Improving the Performance of Bayesian Estimation Methods in Estimations of Shift Point and Comparison with MLE Approach
A Bayesian analysis is used to detect a change point in a sequence of independent random variables from exponential distributions. In this paper, we estimate a change point that occurs in any sequence of independent exponential observations. The Bayes estimators are derived for the change point, the rate of the exponential distribution before the shift, and the rate of the exponential distribution after s...
A Practical Algorithm for Estimation of the Maximum Likelihood Ancestral Reconstruction Error
The ancestral sequence reconstruction problem asks to predict the DNA or protein sequence of an ancestral species, given the sequences of extant species. Such reconstructions are fundamental to comparative genomics, as they provide information about extant genomes and the process of evolution that gave rise to them. Arguably the best method for ancestral reconstruction is maximum likelihood est...
Estimation and Reconstruction Based on Left Censored Data from Pareto Model
In this paper, based on left-censored data from the two-parameter Pareto distribution, maximum likelihood and Bayes estimators for the two unknown parameters are obtained. The problem of reconstructing past failure times, as either point or interval estimates, in the left-censored setup is also considered using Bayesian and non-Bayesian approaches. Two numerical examples and a Monte Carlo simulatio...
Regularization Parameter Selection Methods for Ill-Posed Poisson Maximum Likelihood Estimation
In image processing applications, image intensity is often measured via the counting of incident photons emitted by the object of interest. In such cases, image data-noise is accurately modeled by a Poisson distribution. This motivates the use of Poisson maximum likelihood estimation for image reconstruction. However, when the underlying model equation is ill-posed, regularization is needed. Re...
Publication date: 2010